Analysis of Boyer-Moore-Horspool string-matching heuristic
Authors
Abstract
We investigate the probabilistic behavior of a string-matching heuristic used to search for the occurrences of a pattern in a random text. Our investigation covers two cases: the pattern itself is either fixed or random. Under suitable normalization, we show that the total search time is asymptotically normally distributed in the case of a fixed pattern, whereas in the case of a random pattern the distribution of the search time becomes a mixture of degenerate distributions. An instrumental recurrence equation is obtained by shifting the pattern within the text. To handle the sum of dependent random variables appearing in the recurrence, analytic methods based on the behavior of the shift generating function near its dominant singularity in the complex plane are devised to yield moment calculations and the asymptotic distributions. An adaptation of the standard central limit theorem under mixing conditions complements our analytic toolkit.
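For readers unfamiliar with the heuristic, the following is a minimal sketch of a Boyer-Moore-Horspool search that also counts character comparisons. It is an illustration only, not the paper's formulation: the function name `horspool_search` and the choice of comparison count as a stand-in for "search time" are assumptions, since the abstract does not state the exact cost measure.

```python
def horspool_search(text, pattern):
    """Return (match positions, number of character comparisons).

    Hypothetical helper for illustration; the comparison count is an
    assumed proxy for the search time discussed in the abstract.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return [], 0

    # Shift table: for each character of the pattern except the last,
    # the distance from its rightmost occurrence to the pattern's end.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}

    matches, comparisons = [], 0
    pos = 0
    while pos <= n - m:
        # Compare the window with the pattern from right to left.
        j = m - 1
        while j >= 0:
            comparisons += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            matches.append(pos)
        # Shift according to the text character aligned with the last
        # pattern position; characters absent from the table shift by m.
        pos += shift.get(text[pos + m - 1], m)
    return matches, comparisons


if __name__ == "__main__":
    print(horspool_search("abracadabra", "abra"))
```

The shift applied after each window depends only on the text character under the last pattern position, which is why successive shifts in a random text are dependent random variables rather than independent ones.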
Similar resources
Enhanced Pattern Matching Performance Using Improved Boyer Moore Horspool Algorithm
In computer science, the Boyer–Moore–Horspool algorithm is an algorithm for finding substrings in strings. Pattern matching approaches can be classified as software-based or hardware-based according to how they are implemented. It is important to enhance pattern matching performance. This paper proposes enhanced pattern matching performance using an improved Boyer–Moore–Horspool algorithm. It combines the determinis...
On obtaining the Boyer-Moore string-matching algorithm by partial evaluation
We present the first derivation of the search phase of the Boyer-Moore string-matching algorithm by partial evaluation of an inefficient string matcher. The derivation hinges on identifying the bad-character-shift heuristic as a binding-time improvement, bounded static variation. An inefficient string matcher incorporating this binding-time improvement specializes into the search phase of the Hor...
Deriving the Boyer-Moore-Horspool algorithm
The keyword pattern matching problem has been frequently studied, and many different algorithms for solving it have been suggested. Watson and Zwaan in the early 1990s derived a set of well-known solutions from a common starting point, leading to a taxonomy of such algorithms. Their taxonomy did not include a variant of the Boyer-Moore algorithm developed by Horspool. In this paper, I present t...
Approximate Boyer-Moore String Matching
The Boyer-Moore idea applied in exact string matching is generalized to approximate string matching. Two versions of the problem are considered. The k mismatches problem is to find all approximate occurrences of a pattern string (length m) in a text string (length n) with at most k mismatches. Our generalized Boyer-Moore algorithm is shown (under a mild independence assumption) to solve the pro...
Implementation of exact-pattern matching algorithms using OpenCL and comparison with basic version
In large text-processing tasks, exact pattern matching remains time-consuming. As algorithms asymptotically faster than existing ones cannot be developed, another approach is needed to improve efficiency. Parallel computing can significantly speed up the solution of the exact pattern-matching problem. That is why the current work is focused on par...
Journal title:
- Random Struct. Algorithms
Volume 10, Issue
Pages -
Publication date 1997